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Abstract. Components of complex systems are often classified according to the 
way they interact with each other. In graph theory such groups are known as 
clusters or communities. Many different techniques have been recently proposed 
to detect them, some of which involve inference methods using either Bayesian 
or Maximum Likelihood approaches. In this article, we study a statistical 
model designed for detecting clusters based on connection similarity. The basic 
assumption of the model is that the graph was generated by a certain grouping 
of the nodes and an Expectation Maximization algorithm is employed to infer 
that grouping. We show that the method admits further development to yield 
a stability analysis of the groupings that quantifies the extent to which each 
node infiuences its neighbors group membership. Our approach naturally allows 
for the identification of the key elements responsible for the grouping and their 
resilience to changes in the network. Given the generality of the assumptions 
underlying the statistical model, such nodes are likely to play special roles in the 
original system. We illustrate this point by analyzing several empirical networks 
for which further information about the properties of the nodes is available. The 
search and identification of stabilizing nodes constitutes thus a novel technique to 
characterize the relevance of nodes in complex networks. 



1. Introduction 

Networks are useful tools to characterize complex systems [T1[5J[31|31[51|H1[71IH]- The 

system components are represented as nodes and their mutual interactions as edges. 
Finding structures in such networks is therefore of great relevance for understanding 
the mechanisms that underlie the system evolution. This explains the increasing 
interest in the topic, particularly in the detection of communities pQ [H |U [HI UHl 
[TTl [T^ [T51 UM [T51 [TB] , Communities are groups of nodes with a high level of 
group inter-connection [J- They can be seen as relative isolated subgraphs with few 
contacts with the rest of the network. Communities have an obvious significance for 
social networks where they correspond to groups of close friends or well-established 
teams of collaborators [6 . However, they are also important for characterizing other 
real-world networks such as those coming from biology [31 21 [7] or from technology 
and transport |171 1181 119| . Communities are not the only meaningful structures in 
networks: in Ecology, Computer and Social Sciences structurally equivalent nodes have 
been also considered [3 |6l [8] . These nodes are characterized by similar connectivity 
patterns and are expected to play similar roles within the system. 
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There has been a long tradition of applying Bayesian and Maximum Likelihood 
methods to structure detection in networks [2nil2Il[21ll31IMll2Sl[inil271IMj- These 
methods have the advantage that, depending on the statistical model used, they 
can be very general detecting both communities and structural equivalent set of 
nodes. The drawback, shared with many other methods, is that structure detection 
usually implies computational expensive exploration of the solutions maximizing the 
posterior probability or the likelihood. Recently, a maximum likelihood method that 
considers node clustering as missing information and deals with it using an Expectation 
Maximization (EM) approach has been introduced by Newman and Leicht [TJ [2] . This 
method is computationally less costly to implement and we will denote it by the 
acronym NL-EM from now on. NL-EM is able to identify network structure relying 
on three basic assumptions: (i) the actual connectivity of the network is related to a 
coherent yet a priori unknown grouping of the nodes, (ii) the presence or absence of a 
link is independent from the other links of the network and {Hi) the groups are tell-tales 
of processes that gave rise to the graph. No extra information is assumed except for the 
network itself and the number of groups. Under these assumptions, the method infers 
the classification of nodes that most likely generated the graph detecting communities 
and also structurally equivalent sets of nodes [2] ■ Here we will show that due to the 
simple structure of the NL-EM likelihood, its classifications are based on a subset 
of nodes which turn out to be responsible for establishing the group memberships of 
their neighbors. We are able to rank the nodes according to the amount of group- 
allocation information they transmit to their neighbors and thereby identify those 
that are essential for establishing each group. These nodes, which we will refer to 
as stabilizers, constitute the backbone of the classification: the classification would 
not be viable without them and conversely, stabilizers turn out to emerge as a result 
of their distinct connection patterns on the given graph. Given the generality of the 
NL-EM underlying assumptions and that the resulting classifications can be validated 
by comparison with other clustering methods, we suggest that the stabilizers have 
an important inherent value for understanding the processes that generated the given 
network. Such an expectation is supported by our results on empirical graphs for 
which additional information regarding the nodes intrinsic properties is available. We 
will also briefly discuss the extension of this concept to other inference methods such 
as Bayesian clustering techniques [22l |24] . 

2. NL-EM clustering method 

We begin with a quick summary of NL-EM as applied to graphs. Labeling the nodes 
as i = 1, • • • , A^, the variables are: tt^, the probability that a randomly selected node is 
in group r, 9rj, the probability that an edge leaving group r connects to node j, and 
Qir, the probability that node i belongs to group r. The method is a mixture model 
where an edge between nodes i and j (expressed as i O j) given the groups of i and 
j {gi and gj) is observed with probability 

PT{i^ j\g,,gj)=eg^jeg^,. (1) 

The edges are considered as independent so the probability that a given grouping 
realizes an observed network Q can be written as 



Pr(g|0,^,{g,})-[]^3, 



(2) 
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where Vi is the set formed by the neighbors of node i. 

The group assignment captured by the terms Qir is treated as missing information. 
The Expectation step of EM can thus be implemented as an average over the fog- 
hkehhood 



In TTr 



The maximization of £(7r, 9) is subject to the normahzation constraints, 

Orj = 1 and TT,. = 1 , 



(3) 



(4) 



and leads to 



iV Si 



(5) 



where kj is the degree of node j. The group assignment probabilities q are determined 

a posteriori from 

Pr(a,5, = r|0,^) 



as 



Pr(e|0,7r) 



(6) 



(7) 



The maximization of C can be carried out with different techniques. In order to 
account for the possible existence of a rough likelihood landscape with many local 
extrema, we employed an algorithm that alternates between simulated annealing and 
direct greedy iteration of Eqs. ^ and ([t]). 



3. Stability analysis and stabilizers 

The group membership of the nodes is encoded by the probabilities q. It is thus natural 
to ask for the conditions on a node i and its neighbors to have i crisply classified into 
a single group r so that qis — Srs- The answer highlights the role of the neighbors 
in establishing a node's membership credentials. Looking at the expression for qir, 
Eq. 0, 

qir ~ Y\. ^^j' 

where the non-zero prefactors whose sole role is to ensure proper normalization have 
been suppressed, one finds that for each group s ^ r there must be at least one 
neighbor j of i whose probability 9sj is zero. However, as seen from Eq. ([s]), whether 
9sj is zero or not for some group s depends in turn on the group memberships of 
the neighbors of j. Hence having a node crisply classified as belonging to a group 
sets strong constraints on its neighbors and their respective neighborhoods. These 
constraints propagate throughout the network during the NL-EM iteration until a final 
configuration for 9 and q is established. In this sense, a node j is passing information 
about group membership to its neighborhood through the probabilities 9sj- This 
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information is negative, of the form "you do not belong to group X" when 9xj is zero 
and we say that node j stabihzes its neighbors against membership in group X. It 
is worth noting the parallels of this mechanism with message passing algorithms (291. 
In a classification into Afc groups each crisply classified node i must be stabilized 
against Afc — 1 groups. Thus one can regard the number of groups a node j stabilizes 
against as a measure of the amount of information Ij that j passes to its neighbors. 
If Ij = Afc ~ 1; node j can stabilize its adjacent nodes alone providing thus complete 
information about their group membership. On the other hand, when Ij < Afc — 1, i 
provides only partial information. The crisp classification of a neighbor i requires then 
the combined action of other adjacent nodes in order to attain full group membership 
information. We denote as stabilizers of i the union set of neighbors that alone or 
in combined action pass essential information to i establishing its membership in 
a single group (a more precise definition will be given below). The above analysis 
implies that any crisply classified node must be stabilized by one or more stabilizers. 
Therefore, if the assumptions of the statistical model are justified and the resulting 
node classification is meaningful, the identification of the corresponding stabilizers 
may offer useful additional information. 

Based on their classification and the information passed, four types of nodes can 
be distinguished: nodes can be strong or weak depending on whether they are crisply 
classified into a single group or not, and they can be stabilizers or not, depending 
on whether they pass essential information for establishing an adjacent node's group 
membership. If we consider a node i and denote by ai = {r'|6'ri = 0}, the set of groups 
that i does not connect to, and by Ci — {r\qir — 0}, the set of groups that i does not 
belong to, the NL-EM equations ([s]) andQ relate these sets as follows: 



forming a set of consistency relations with a simple meaning: a node cannot belong 
to a group to which its neighbors do not connect, and the common set of groups to 
which a node's neighbors do not belong must correspond to the groups that it does 
not connect to. If we require in particular that a node i is strong, i.e. it is crisply 
classified as belonging to a particular group A, then Ci —C \ {A} [5U] . 

Given the sets aj associated with the neighbors j of a strong node i, not all 
adjacent nodes need to contribute to its full stabilization. Likewise, node i can be 
stabilized by different combinations of its neighbors' sets aj. This is best illustrated 
by an example shown in Fig. [l] Suppose that the groups are C = {A, B, C, £), E} and 
let us assume that node i is crisply classified as E. Let i have four neighbors with 
corresponding sets ai = {^, i3,C}, a2 = {A,_D}, = {B,C} and 0-4 = {A}. It is 
clear that all four nodes together must stabilize i, as otherwise i would not be a strong 
node. However, the sets of neighbors {1,2} or {2,3} each suffice to stabilize node i. 
The node 4 is redundant, since it does not contribute a new class against which 2 or 
3 are not already stabilizing. In other words, if the set {2,3,4} is considered, node 4 
can be removed without altering the stabilization of i. The same is not true for the 
nodes 2 and 3. The notion of stabilization sets and stabilizer nodes can be defined as 
follows: A subset of nodes adjacent to i is a stabilization set of i, if the removal of any 
one of the nodes from the set causes i not to be stabilized by that set anymore. A 
node j is a stabilizer if it is member of at least one stabilization set. The definition of 
stabilizer involves thus a stabilization relation with at least one of the node neighbors. 
In the above example, 1, 2 and 3 are the only stabilizers of i. Non-stabilizer nodes can 




and 




(9) 
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aj={A^,C} a,= {A4)} 




03={B,C} a,= {A} 

Figure 1. Stabilization of a strong node i. The groups are C = {A, B, C, D, E} 
and i is crisply classified as E. The four adjacent nodes are shown along with 
the set of classes a to which they have no connections to. In order for i to be 
classified as E there must be a subset of adjacent nodes such that the union of 
their corresponding ct is C — {E}, Eq. (j9]l. All four adjacent nodes must stabilize 
i, as otherwise i would not be a strong node. However, the sets {1,2} or {2,3} 
each suffice to stabilize node i. The node 4 is redundant, in the sense that any 
stabilization of i involving node 4 remains a stabilization of i when 4 is removed. 
A precise definition of stabilizers is given in the text. 



be removed without affecting stabilization, while whenever a stabilizer is removed the 
number of ways in which a given node is stabilized decreases. In the example of Fig. [TJ 
the removal of node 2 would cause complete loss of stabilization of i, while removal of 
3 or 1 would leave i with only a single stabilization. It can be shown that the removal 
of a stabilizer will never turn a previously non-stabilizer node into a stabilizer, but it 
might turn some stabilizers into non-stabilizers. Note that in a sense stabilizer 2 is 
more important than 1 or 3, since it is part of every stabilization of i and its removal 
will thus render i a weak node. In fact, one could attach a strength to each stabilizer 
by keeping track of the number of stabilizations in which it is involved, but, for sake 
of simplicity, we will not pursue this here. 

Given an NL-EM classification with strong nodes, we can immediately identify 
the stabilizers that are responsible for the crisp classifications. Details on how to 
implement the identification of stabilizers are provided in Appendix A. 2. The relation 
i stabilizes j induces a directed subgraph on the original network and we will refer 
to this as the stabilizer subgraph. The relation between two stabilizer nodes is not 
necessarily of mutual stabilization: a necessary condition for adjacent strong nodes 
i and j to mutually stabilize each other is that both ai D Cj and aj D Ci are empty. 
The connections among strong stabilizers capture the relations between groups in 
the graph. In that sense one can regard the stabilizers as exemplary members of 
the groups. In the undirected graphs of Figs. [2] -[5] the stabilizer subgraph has been 
superposed. The extension of these concepts to NL-EM classifications in directed 
graphs is similar, details are given in Appendix B. 
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The case of NL-EM classifications into two groups is particularly simple. Denoting 
the groups as A and A, a crisply-classified (strong) node belongs to either A or A and 
a strong node of a given group has to be stabilized against the complementary group. 
All nodes with non-empty a are therefore stabilizers, and if more than one is present all 
are equivalent, each stabilizing a given node independently from the other stabilizers. 
Moreover, the strong stabilizers are nodes that are stabilized themselves by some of 
their neighbors which necessarily are also stabilizers. The conditions of Eq. ^ permit 
only two possible configurations of the stabilizer subgraphs. Either strong stabilizers 
of group A connect to strong stabilizers of their own group, or stabilizers of group A 
connect to those of the complementary group A. In the former case we get a disjoint 
community like partition (c/. Fig. |4]) of the stabilizer graph, whereas in the latter 
case we obtain a bipartite partition (c/. Fig. [5|. 

Furthermore, the NL-EM classification into two groups reveals a simple but 
meaningful hierarchical structure in the way the different type of nodes in the 
classification relate. Strong (non-stabilizer) nodes are nodes for which ct = 0, so 
these nodes connect to nodes of both groups (weak or strong) , however in order for 
them to be strongly classified as in one group, let us say, A (A) they can only connect 
to those stabilizer nodes with the compatible stabilizer classes a = {A} {a = {A}). In 
turn, the neighborhood of strong stabilizer nodes with a — {A} or a — {A} can consist 
only of nodes strongly classified as A or A, respectively. The weak stabilizer nodes 
are by definition nodes for which c = 0, but for which a = {A} or a = {A}. Thus 
weak stabilizer nodes cannot connect to strong stabilizer nodes, but they can stabilize 
strong (non-stabilizer) nodes. Finally, the weak nodes that are neither strong nor 
stabilizing can connect to strong non-stabilizing nodes and other weak nodes. In this 
way the connection rules for the strong stabilizers, weak stabilizers, strong nodes, and 
weak nodes set up a hierarchy of nodes at the core of which are the strong stabilizers. 

4. Stabilizers in a benchmark 

As we observed in the previous section, a node can be stabilized by its neighbors 
in multiple ways. This redundancy renders classifications robust against disorder 
introduced by the addition or removal of edges up to a certain point. To illustrate this 
we consider a benchmark with four communities jllj . The initial network is generated 
with four disjoint groups of 32 nodes each, with the nodes having on average (kin) — 16 
in-group links. These groups correspond to the four clusters of Fig.|2][A)-(D). Random 
links connecting different groups are added to the basic configuration and the number 
of stabilizers are tracked as a function of the average number of out-group links kout ■ 
Fig.[2]shows the stabilizers obtained from an NL-EM classification into J\fc = 4 groups 
at disorder level kout = 0.5,6.0,8.3 and 15.3. 

When kout = we find a crisp classification where all nodes are strong stabilizers, 
meaning that all nodes stabilize and are being stabilized. Furthermore, all of them 
provide complete stabilization information, / = 3, with a single stabilizer sufficing to 
crisply classify a neighbor. Since {km) — 16, there is on average 16-fold redundancy 
in the stabilization of each node. As random connections are added to the network, 
the four clusters become connected with each other. Some of the stabilizers start 
to stabilize against fewer classes, giving rise to a decrease in the average /. In the 
right panel of Fig. [3] we have plotted how the average stabilization information decays 
when kout increases. In order for nodes with / < 3 to be stabilizers they have to act 
in combined action with other nodes, as in the example of Fig. [ij Thus an increase of 
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ko„t= 0.5 




(B) 

kout = 6.0 





Figure 2. Benchmark with four communities and increasing intra-community 
connections, kout- (A), (B), (C) and (D): four instances of the graph classification: 
strong stabihzers, weak stabihzers, strong nodes and weak nodes are shown as 
rhomboids, cubes, spheres and cones, respectively. Nodes have been arranged by 
communities while their color depicts the group to which the NL-EM algorithm 
assigns them with highest probability. The directed (dark) arcs show the 
information flow, as captured by the stabilization relation. 



the level of disorder kout causes both a reduction in the redundancy of stabilizations of 
strong nodes and a shift towards stabilizations by combined action of more than one 
stabilizer. The increase in disorder eventually leads to a loss of strong nodes, implying 
that the classification deteriorates. In order to assess the quality of classifications, we 
use the entropy Sq, as defined in [2] 

Sq ^ --^^qirlnq„- (10) 

ir 

The entropy Sq measures the crispness of a classification. When Sq = 0, all the nodes 
are strong, while 5*^ = \n{J\fc) corresponds to case where the classification of the nodes 
is maximally uncertain. The right panel of Fig. [3] displays Sq as a function of kout, 
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Figure 3. Benchmark with four communities and increasing intra-community 
connections continued. (Left) Number of nodes vs. disorder: strong (diamonds) 
and weak (boxes) stabilizers, strongly (circles) and weakly classified nodes 
(triangles). (Right) average information (/) passed by all nodes (continuous green 
curve) or stabilizers only (green dashed curve, for which / > 1 by definition). The 
blue curve shows the entropy of the classification, as defined in Eq. | |10| l. The 
values for each data point in the plots have been obtained from averaging over 
100 realizations of the random process of edge additions. 



showing that the crispness of the classification is lost for large fcout- 

The increase in entropy is closely related to what happens to the different nodes 
in the classification as edges are added, particularly to the stabilizers. The variation of 
the number of the different type of nodes with kout is shown in the left panel of Fig. [3] 
As the addition of new edges progresses, some nodes cease to be strong stabilizers. 
When a node is not a strong stabilizer anymore, it can still remain strong as long as 
there are other nodes stabilizing it in its neighborhood. As can be seen in the left panel 
of Fig. [3| this is what is happening up to kout ^ 4: The number of strong stabilizers 
decreases while the number of strong nodes rises accordingly. Therefore, initially the 
effect of adding edges is to convert strong stabilizers into strong nodes. Most of the 
nodes remain strong (stabilizer or not), and the classification is essentially crisp with 
an entropy Sq « 0. With the further addition of edges, the number of strong nodes 
starts to decrease as a result of the loss of stabilization, giving rise to the appearance 
of weak stabilizing and non-stabilizing nodes at kout ^ 4. Continuing to kout ~ 10, the 
entropy of the classification remains very low because there still is a sizable number 
of strong nodes supported by a few weak and strong stabilizers (see panels B and C in 
Fig. [2]). As further edges are added, the number of weak stabilizers starts to decrease 
as well, and eventually most of the nodes are weak and non-stabilizing, accounting for 
the quick rise in the classification entropy Sq starting around kout ~ 10. 

5. Real- world networks 

We focus now on some empirical examples to show the special role that the stabilizers 
play in a classification and the type of information that they convey while also 
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Figure 4. Political affinity network between congressmen of the 109th US Senate. 
Right: The node shapes are as in Fig. 1: rhomboids are strong stabilizers and 
spheres well classified nodes not passing essential group information. The colors 
convey the NL-EM classification which follows the partition into Democrats (blue) 
and Republicans (red) fields. Left: the senators are displayed as ranked according 
to their liberal-conservative score 1341 . The average values of the score in 
the different sub-groups are: Rep. stabilizers 0.45 it 0.08, Rep. strong nodes 
0.33 ± 0.12, Dem. strong nodes -0.22 ± 0.08 and Dem. stabilizers -0.37 ± 0.06. 



highlighting the versatiUty of our analysis. As explained, classifications into two groups 
are particularly simple and in this case the stabilizers can be easily identified once a 
solution of the NL-EM clustering is given. This simplicity makes them good candidates 
to illustrate the properties of the stabilizers. We present first two examples of this 
type that show the role of the stabilizers and the relations between them. We then 
turn to a directed network with a classification into 4 groups in order to illustrate a 
more general situation. 

The first example is a network built from the voting records of the 109th US Senate 
[5T| . The nodes represent senators that served the full two year term (2005 — 2007) 
during which 645 issues were voted. Since our aim is to construct a network based on 
political affinity, we draw an edge between two senators if they voted in the same way 
at least once. The edges are weighted by the reciprocal of the number of co-voting 
senators minus one, a common practice for collaboration networks [32 . In this way, 
an agreement in minority on an issue has a higher value than that in an unanimous 
vote, differentiating more clearly close political standings. Due to circumstantial 
quasi-unanimous votes, the network is initially close to fully connected. A threshold 
such that edges with lower weights are removed can be introduced, and the resulting 
networks can be analyzed as the threshold increases. We have applied two-group NL- 
EM to these networks. Once the threshold is high enough, the clusters found follow 
well the divide between Democrats and Republicans. The instance in which about 
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Figure 5. Stabilizer analysis for the noun-adjective network in David Copperfield 
1371 . The links represent words appearing in juxtaposition, while the superposed 
directed links indicate the stabilization relations. The NL-EM class assignment 
correlates strongly with the word being a noun (green) or an adjective (red). On 
the right, the subgraph formed by the strong stabilizers that exhibits a strict 
bipartite ordering. 



half of the senators, either Repubhcans or Democrats, are stabihzers is displayed 
in Figure [4j Congress roll calls and their derived networks have been extensively 
studied in the literature [33l |34l [35l |36] . One of the most interesting results is that 
single votes of a representative can be understood with a low dimensional spatial 
model (DW-NOMINATE [331 131]) in which a set of coordinates can be assigned to 
each congressman characterizing his/her political stand on the different issues. Since 
the 90's the number of dimensions required has been reduced in good approximation 
to only one that strongly correlates with the congressman's view on socio-economic 
questions (liberal vs. conservative) [SSlEl]. In Fig[4j we show the relation between 
being a stabilizer and the location in the liberal-to-conservative dimension. The 
stabilizers tend to be the most radical members of the Senate who are probably defining 
the overall position of their groups. This exercise can be repeated on networks obtained 
with different thresholds. It can be seen that as the threshold increases more and more 
nodes turn into stabilizers. Keeping track of the senators that become stabilizers at 
different thresholds allows for a refined exploration of the political spectrum. Note in 
particular that the above results have been obtained by simply looking at the co-voting 
relation and without considering the vote records in detail, i.e., the actual issue put 
to vote. 

In our second example we show how by extracting the sub-graph of stabilizers 
we can obtain from its structure useful information about what features distinguish a 
stabilizer node and how the groups relate in a classification. We consider a semantic 
network in which the nodes are adjectives and nouns occurring most frequently in 
Charles Dickens' novel David Copperfield [37]. A relation between any two of these 
words is established if they occur in juxtaposition. In Fig. [5] we have represented the 
network, the best NL-EM partition in two groups and identified the types of nodes. 
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Figure 6. Number of strong and weak stabilizers vs number of groups Afc as 
obtained from the NL-EM classification of the Little Rock Lake foodweb [38) . The 
maximum number of strong stabilizers occurs around Afc = 3, 4. This number is 
close to the trophic level which is around 4, suggesting that a classification into 4 
groups might capture the trophic levels. 



There turn out to be two sub-groups containing nouns or adjectives only that are 
strong stabihzers. These two sub-groups bear the responsibihty for the classification 
of remaining words by association. Note that the only input to the NL-EM method 
is the network. We are not introducing any bias for the partition in adjectives and 
nouns. Most of the remaining words are well classified. The stabilizers, central to 
establishing the classification, are the words always occurring in strict combinations 
like true friends, never mixing with members of the same group and they form a 
bi-partite sub-graph of stabilizers as shown in the right panel of Fig. [5] Conversely, 
nonstabilizing nodes are words appearing in mixed roles, such as the word little in the 
adjective-adjective-noun triplet poor little mother. 

Our final example, showing a more general case with 4 groups, is the Little Rock 
food-web. The vertices of this network are species living in the aquatic environment of 
Little Rock Lake in Wisconsin '38' . Each directed link represents a predation relation 
pointing from predator to prey. The number of trophic levels is around four ^39) and 
turns out to be the number of groups for which the NL-EM algorithm produces a 
partition with highest abundance of strong stabilizers, as shown in Fig. [6] where we 
have plotted the number of stabilizers of an NL-EM solution against the number of 
groups Afc- A property of the four group classification depicted in Fig. [7] is that it 
keeps basal species (green) in one group, top predators (cyan) in another, and assigns 
the rest to two different groups based on the prey they feed on at the basal level. 
The species that are not strong stabilizers, for instance nodes 11, 61 or 80, could be 
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related to a missing data problem. In the case of 61 (Hydroporus) or 80 {Lepidoptera 
Pyralidae), the species appear only as prey having no connection to lower levels. 
However, its consumers are not typically feeding on basal species, they are "cyan", 
and this results in an NL-EM classification that assigns them into the "red" group. 

As seen in Fig. [tJ^A), most of the species of the network are strong stabilizers. 
Their abundance is a direct result of the highly structured organization of the foodweb: 
similar species have similar prey which, as our analysis shows, is also linked to their 
trophic levels (see Fig. [7]B and C). Or more correctly, the consistent choice of species 
a predator does not prey on is what renders them stabilizers. The possibility of 
classifying species in low dimensional spaces depending on their trophic level and on 
the intervality of their prey distribution has been extensively discussed in the literature 
[H HOl im |42] . Our stability analysis reveals an underling structure in the connectivity 
pattern of the foodweb, which is responsible for the success of these low dimensional 
models. 

6. Discussion and Conclusions 

The maximum likelihood function upon which the NL-EM inference method is based 
is rather generic and depends on the assumption that nodes with similar connections 
should be grouped together. Using this likelihood function we were able to show that 
a subset of nodes, the stabilizers, associated with a given grouping play a central role 
as they form the backbone of the classification which could not be attained without 
them. The mathematical basis behind the concept of stabilizers is rather intuitive and 
follows from the product form of the group assignment probabilities, Qir, in Eq. Q, 
which is in turn a direct consequence of the assumption that the edges are statistically 
independent (Eq. ([2])). Such an assumption is common to a number of probabilistic 
clustering methods. We can rewrite Eq. ([t]) as 

Qir = Y\. ^r], (11) 

where 



(12) 



so that the prefactors are equally absorbed into 6j.j. Note that is in the interval 
[0,1]. Written in the above form it is clear that very small values of qir must 
arise from very small values of 9rj dominating the product. Likewise, we see from 
d\n Qir /d6rj — ^/drj that changes in these factors will have the greatest effect on the 
value of Qir- The stabilizers we have introduced here constitute the extreme case, 
namely the nodes j for which O^j = 0. As we have shown, this requirement together 
with the fact that O^j depends in turn on qir has allowed us to extract the stabilization 
rules for crisply classified nodes, qir — Sg-r- However, this concept could be relaxed to 
define stabilizers more generally by requiring only that 6rj < e with e appropriately 
chosen. 

It is possible to apply the notion of stabilizers to other probability models for node 
classification such as those considered by Airoldi et al. [21] or Nowicki and Snijders [22] . 
An inspection of Eqs. (2) and (3) as well as the equation for B{g,h) of [21] reveals 
a similar structure for the inter-relation between the edge-based class assignment 
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Figure 7. (A) Stabilizers for the best 4 group NL-EM classification of the Little 
Rock Lake foodweb I38| . Nodes are species and directed links correspond to 
predation relations. The node labeling follows 1381 . (B): fraction of species 
belonging to each group plotted against their prey-averaged trophic level (TL) 
and the standard deviation of TL of their preys, as defined in 1391 . The radius of 
the spheres is proportional to the log of the percentile. Spheres with two colors 
include species of more than one group (each sphere or half-sphere is independent). 
(C): averages of TL and cttl over the species forming each group. 



Stability of Maximum likelihood based clustering methods 



14 



probabilities and the class connection probability B{g,h), which are analogues of 
the probabilities q and 9. The variational Expectation Maximization approach of [Mj 
can also be applied to a model that was considered by Nowicki and Snijders |22) . 
which is more akin in spirit to the model presented here [J^. For both models, 
however the resulting rules of stabilization are rather involved due to the inclusion 
in the likelihood of the absence of edges, as well as due to the non-factorizable form 
of Pr(i -f-?- j\gi,gj) = vigi,gj) as compared with Eq. ([I]). The attractiveness of the 
probability model, Eqs. ([T]) and is that it delivers meaningful classifications despite 
of its simplicity, while at the same time the corresponding stabilization rules have a 
rather immediate interpretation, as we have shown in Sections 3 and 4. 

In summary, we have presented a general method for inferring information about 
which elements are most relevant in establishing group structures in a complex 
network. The maximum likelihood function upon which our inference is based is 
rather generic. This approach does not assume any additional a priori knowledge 
about the network, rendering it attractive in circumstances in which the available 
information about the nodes is limited. In particular, we have introduced the concept 
of stabilizers associated with a given NL-EM classification and shown that they play 
a central role in the network partition. If the stabilizers were removed from the 
network, the partition would lose its meaning. If on the other hand, the subgraph 
formed only by stabilizers is considered, the classification remain intact and useful 
information regarding the interaction between the different groups in the graph can 
be obtained. The stabilizers represent therefore the gist of a network partition. Their 
identification is highly useful in understanding the way in which the structure of 
complex systems form and their elements aggregate in clusters. This technique has a 
wide applicability as we have shown with three empirical examples of networks of very 
different origins: social sciences, semantics and ecology. In addition it raises several 
important questions, such as the role of these special nodes in the evolution of any 
dynamic process running on the graph such as the spreading of opinions, rumors or 
diseases, or even in the evolution of the graph itself if the network is dynamic. 
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Appendix A. Numerical implementation details 

Appendix A. 1. NL-EM algorithm 

Given a network, the search for classifications of the NL-EM algorithm was carried out 
using an algorithm that alternates between simulated annealing and a direct greedy 
iteration of Eqs. ^ and ([7|. The program was run with a set of 10 000 different initial 
conditions for 9 and vr for each value of the number of groups Afc- Once the algorithm 
converged to a stationary value of the likelihood function, the instance with the best 
£ was selected. 
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Appendix A. 2. Extraction of stabilizers 

We outline here the algorithm we have used to extract stabilizers from an NL-EM 
classification with strong nodes. Tliv probkuii of determining the set of stabilizers 
associated with a strong node is related to the set covering problem in Computer 
Science, which is NP-complete. If a strong node has s adjacent nodes with non-empty 
d", there are in principle 2'^ combinations that have to be checked for finding the 
fundamental sets leading to stabilizations. In practice, many of the combinations can 
be eliminated by observing that if, say, cti, CT2, . . . , have been selected as candidates 
for a stabilization, any a that is a subset of the union of the (j^'s is redundant and thus 
cannot be part of that stabilization. This is the main strategy of our algorithm. Also 
note that, if there are Afc number of classes the number of possible distinct stabilizer 
sets tj is S = 2-^'="'^. For small A/c, s can be larger than S so that there are duplicates 
which can be removed beforehand. 

We have used a recursive algorithm for detecting the stabilizers. We partially 
order the s sets ai by their size. Two binary arrays of size s, iSelected and iAvailable 
indicate the candidates already selected and those available for contribution to a 
stabilization, respectively. The classes against which the s nodes stabilize are coded 
in an s X (Ac — 1) binary array arrStab, where the non-zero elements of arrStab[j, *] 
indicate the classes against which node j is stabilizing. A recursively called subroutine 
PickNext(), givenin Fig. A.l in pseudo-code, performs the task of determining all 
stabilizations of a strong node, given the sets arrStab. In the algorithms we 
have assumed that there is already defined a procedure Wheie{List, Value) = 
(Pointer, N Found), which takes a list and returns the indices where the list element 
equals to Value along with the number of elements found N Found. Also in our 
notation when two lists are operated on term by term we denote this as NewList[*\ ^ 
ListOne{*] < Operator > ListTwo[*\, avoiding having to write out explicitly a loop 
over the operation on individual elements. 

Algorithm Al: Pi.cV.He:yit{iSeiectecl,iAvaxlabLe, arrStab) 

{iAvaiLabLe,NavaiLList) <- WHERE(i/l\/oi(.obte,l) 
for i <- to NavaiLList-1 do 

xSeLectedLocaL <- iSeLected 

lAvaiLabLeLocaL <- iAvaiLabLe 

iSeLecteclLocaL[iAvaiLList[i]] *- 1 

GettiextA\/ailable(,iSeLectedLocaL,iAvaiLabLeLocaL,iSeLectedLocaL[iAvaiLList[i]], arrStab) 
Picktiext(,iSeLectedLocaL , iAvai LabLeLocaL, arrStab) 

end 

comment : If we reach this stage, there is nothing else to select 

(iUsed.nUsed) *- WHERE(iSeLectedLocaL,l) 
iBit[*] «- arrStab[iUsed[0],*] 
for i «- to nUsed-1 do 

ieit[*] *- ieit[*] or arrStab[iUsed[i],*] 

{MissingCLass^Nmissing) *- WHERE({Bitj0) 

end 

comment : If this is not a stabilization, simply return. 

if Nmissing > 
return 

comment: We found a stabilization. Update Lists, etc. 



Figure Al. Pseudocode for algorithm PickNext. 
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Initially PickNext(iSelected, iAvailable, arrStab) is called with the binary arrays 
iSelected and iAvailable initialized to zero and one, respectively. The algorithm 
getNextAvailable (see Fig. A. 2) updates iAvailableLocal, the set of available 
stabilizers that can contribute to a stabilization after iS electedLocal[i] has been added. 

Algorithm A2: GetNextAvailable (iSetectedj iAvaiLabLe, iPointer ,arrStab) 

for i <- to xPointer do 
■iAvaiiabLe[i] <- 

end 



comment : Make sure we have something to select. 

(iWextjnWext) <- WHERE(i4i/ai/.ai)/.e,l) 
if nNext = 
return 



comment : Form the union of sets already selected. 

(iUsed.nUsed) <- WHERE(iSe/.ected,l) 
ieit[*] «- arrStabliUsedle],*] 
for 1 <- to nUsed-1 do 

iSit[*] «- i5it[*] or arrStab[xUsed[i],*] 

end 



comment : Now determine the stabilizers that contribute. 

for i <- to nNext-l do 

iDi//[*] «- iSit[*] - arrStab [iA/ext[i],*] 
{Diff.nDiff) <- WHERE ( iDi//, -1) 
if nDi// = 

iAvaiLabLe[iNext[i]] <- 

end 



Figure A2. Pseudocode for algorithm getNextAvailable. 



Appendix B. Extension of NL-EM to directed graphs and stabilization 



A generalization of NL-EM to directed graphs that preserves structural equivalence [5J 
|6l [8] was recently provided in our earlier work ^2,. We assume that given a node z, a 
link to a node j can be either out-going, in-going or bi-directional. We thus introduce 
the probabilities: 

• 9rj that a directed link leaving a vertex of group r connects to node j, 

• 9rj that a directed link pointing to a node in group r exists from j, and 

• 9rj that a bidirectional link exiting from group r connects to j, 
and construct the probability of realizing a directed graph Q as 



Pr(5,.g|7r,V,^,'^) =n 



n ^9=^^- n n 



.(B.l) 



Vi, Vi, and Vi are the set of adjacent nodes of i from which i receives an in-coming, 
out-going, and bi-directional link, respectively. 
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The likelihood can now be written as 



L iet^, jel^i jet^, 

which has to be maximized under the following constraint on the probabilities 9rj , 

+ ^r,t + '9r.^) = 1, (B.3) 

i 

implying that there is no isolated node. The probability tt^, that a randomly selected 
node belongs to group r, is again given by J2r ""r = 1- The final result is [2] 

'^r = -^'Yqtr, (B.4) 



(B.2) 







-MY 















(B.5) 



where fc*, A:° and fcf are the in-degree, out-degree and bi-directional degree of node i, 
respectively. 

These expressions have to be again supplemented with the self-consistent equation 
for Qir which now reads 

•(-—>■<->■ 
^^Uj^tl Hje^^ Srj rijgt* Orj ^^^^ 



Note that when we have only bi-directional links so that Vi=l^i= for all i, 
and it follows from Eq. (B.5| that 9rj = 0rj= 0. Thus we recover the undirected EM 
equations Eqs. ^ and ^ under the identification 9rj —9rj- 

Appendix B.l. Stabilization rules for directed graphs 

The case of directed graphs is similar to the undirected case with a few minor 
modifications. Given a NL-EM classification of a directed graph C/, we associate with 
each node i the following four sets: 

u a i — {r\ Ori— 0}, the set of groups that i does not have an out-going connection 
to, 

• a i — {r\ 0ri= 0}, the set of groups that i does not have an in-going connection 
to, 

<-> 

• Gi = {r\ Ori= the set of groups that i does not have an bi-directional 
connection to, 

• Ci — {r\qir = 0}, the set of groups that i does not belong to. 
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along with their complements, ai, cTi, ai, and q 



The NL-EM equations, Eqs. B.5 and B.6 relate the sets ai and Ci to each other 
as follows: 



U ^3 U ^3 U ^3 =Ci, 

je'^i jel^i j'^'vi 

n =^j' 



(B.7) 
(B.8) 
(B.9) 
(B.IO) 



Defining the set of all stabilizer classes associated with a node, irrespective of the 
directionality as 

CTi = CTi U CTj U (T-j, (B.ll) 

the stabilization condition for a node i becomes identical to the one for the undirected 
case. 



u ^ 



J — '^l- 



(B.12) 
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